Misclassification of class C G-protein-coupled receptors as a label noise problem
Authors
Abstract
G-protein-coupled receptors (GPCRs) are cell membrane proteins of relevance to biology and pharmacology. Their supervised classification into subtypes is hampered by label noise, which stems from a combination of limitations in expert knowledge and the lack of a clear correspondence between labels and the different representations of the protein primary sequences. In this brief study, we describe a systematic approach to the analysis of GPCR misclassifications using Support Vector Machines. We use it to assist the discovery of database labeling quality problems and to investigate the extent to which physicochemical transformations of GPCR sequences reflect GPCR subtype labeling. The proposed approach could serve as the basis for a filtering strategy for the label noise problem.
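The filtering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: it assumes the sequences have already been transformed into a numeric feature matrix, and the function name `flag_suspect_labels`, the RBF kernel, the number of repeats, and the 0.8 threshold are all hypothetical choices made for illustration. A sample is flagged when a Support Vector Machine misclassifies it in most repeated cross-validation runs, which marks it as a candidate for label review rather than as a confirmed labeling error.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC


def flag_suspect_labels(X, y, n_repeats=5, n_splits=5, threshold=0.8, seed=0):
    """Flag samples that an SVM misclassifies in most repeated CV runs.

    Illustrative assumption: X is a numeric feature matrix (for example,
    a physicochemical transformation of each sequence) and y holds the
    database subtype labels. Returns (suspect indices, per-sample
    misclassification rate).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    miss_counts = np.zeros(len(y), dtype=int)

    for r in range(n_repeats):
        # A different shuffle per repeat, so a flag does not depend on one split.
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed + r)
        model = SVC(kernel="rbf", C=1.0, gamma="scale")
        # Each sample is predicted by a model that never saw it during training.
        pred = cross_val_predict(model, X, y, cv=cv)
        miss_counts += (pred != y)

    miss_rate = miss_counts / n_repeats
    suspects = np.flatnonzero(miss_rate >= threshold)
    return suspects, miss_rate
```

In this sketch the flagged indices would be handed to a curator or excluded before retraining; which of the two is appropriate depends on how much the database labels are trusted.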
Similar resources
Using random forests for assistance in the curation of G-protein coupled receptor databases
Background: Biology is experiencing a gradual but fast transformation from a laboratory-centred science towards a data-centred one. As such, it requires robust data engineering and the use of quantitative data analysis methods as part of database curation. This paper focuses on G protein-coupled receptors, a large and heterogeneous super-family of cell membrane proteins of interest to biology in...
G-protein Coupled Receptor Dimerization
A growing body of evidence suggests that GPCRs exist and function as dimers or higher oligomers. The evidence for GPCR dimerization comes from biochemical, biophysical and functional studies. In addition, researchers have shown the occurrence of heterodimerization between different members of the GPCR family. Two receptors can interact with each other to make a dimer through their extracellular...
COUPLED FIXED POINT THEOREMS FOR RATIONAL TYPE CONTRACTIONS VIA C-CLASS FUNCTIONS
In this paper, by using C-class functions, we will present a coupled fixed problem in b-metric space for the single-valued operators satisfying a generalized contraction condition. First part of the paper is related to some fixed point theorems, the second part presents the uniqueness and existence for the solution of the coupled fixed point problem and in the third part we...
An Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
Visual Characterization of Misclassified Class C GPCRs through Manifold-based Machine Learning Methods
G-protein-coupled receptors are cell membrane proteins of great interest in biology and pharmacology. Previous analysis of Class C of these receptors has revealed the existence of an upper boundary on the accuracy that can be achieved in the classification of their standard subtypes from the unaligned transformation of their primary sequences. To further investigate this apparent boundary, the ...
Journal title:
Volume, issue
Pages -
Publication date: 2014